A Hybrid Implementation of K-Means and HAC Algorithm and Its Comparison with other Clustering Algorithms

Authors

Anita Ganpati

Jyoti Sharma

Abstract

A huge amount of data is produced every day in the information technology industry, but it is of no use until it is converted into useful information. Data mining is the process of extracting hidden predictive information from large databases. It offers an easy, time-saving way to extract useful information from a large database without going through the whole database. Clustering is one of several data mining techniques, and clustering algorithms in particular draw significant attention from researchers around the world because they make similar data readily available in the form of clusters. Many types of clustering algorithms are available in the literature, each with its own pros and cons. This paper presents a hybrid implementation of the k-Means and HAC clustering algorithms and compares the hybrid approach with four other clustering algorithms: k-Means, DT, HAC, and VARCHA. The hybrid was implemented in Python, and the open-source scikit-learn tool was used to compare the performance of the algorithms. The parameters used for comparison were accuracy, precision, recall, and F-score. The results show that the hybrid algorithm performs better than the existing ones.
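The abstract does not describe how the two algorithms are combined. One common way to hybridize them, assumed here purely for illustration, is to let k-Means compress the data into many small micro-clusters and then let HAC merge the micro-cluster centroids into the final clusters. The sketch below uses only the Python standard library. The function names, the average-linkage choice, and the n_micro parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of each cluster (keep the old centroid if empty).
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def hac(items, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per item."""
    groups = [[i] for i in range(len(items))]
    while len(groups) > n_clusters:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                # Average pairwise distance between the two groups.
                d = sum(math.dist(items[i], items[j])
                        for i in groups[a] for j in groups[b])
                d /= len(groups[a]) * len(groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        absorbed = groups.pop(b)  # b > a, so index a stays valid
        groups[a].extend(absorbed)
    labels = [0] * len(items)
    for g, members in enumerate(groups):
        for i in members:
            labels[i] = g
    return labels


def hybrid_kmeans_hac(points, n_clusters, n_micro=10, seed=0):
    """Assumed hybrid: k-means micro-clusters, then HAC on their centroids."""
    centroids, micro_labels = kmeans(points, n_micro, seed=seed)
    merged = hac(centroids, n_clusters)
    # Each point inherits the final label of its micro-cluster.
    return [merged[m] for m in micro_labels]


# Illustrative data: three synthetic groups of 30 two-dimensional points.
rng = random.Random(1)
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in centers for _ in range(30)]
labels = hybrid_kmeans_hac(points, n_clusters=3, n_micro=9)
print(sorted(set(labels)))
```

In this arrangement HAC only has to compare n_micro centroids rather than every data point, which is one reason such a combination is attractive; whether the paper follows the same arrangement is not stated in the abstract.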

Similar resources

A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS

Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used so widely in many fields, a lot of research is still under way to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of cluster centers and can therefore become trapped in a local optimum. In this paper...

Improved COA with Chaotic Initialization and Intelligent Migration for Data Clustering

K-means is a well-known clustering algorithm. Despite advantages such as high speed and ease of use, it suffers from the problem of local optima. Many studies in clustering have tried to overcome this problem. This paper presents a hybrid of the Extended Cuckoo Optimization Algorithm (ECOA) and K-means (K), called ECOA-K. The COA algorithm has advantages ...

Tabu-KM: A Hybrid Clustering Algorithm Based on Tabu Search Approach

The clustering problem under the minimum sum-of-squares criterion is a non-convex, non-linear program with many locally optimal values, so its solution often falls into these traps and cannot converge to the globally optimal solution. In this paper, an efficient hybrid optimization algorithm called Tabu-KM is developed for solving this problem. It gathers the ...

Data Clustring Using A New CGA(Chaotic-Generic Algorithm) Approach

Clustering is the process of dividing a set of input data into a number of subgroups. The members of each subgroup are similar to each other but different from members of other subgroups. The genetic algorithm has enjoyed many applications in clustering data. One of these applications is the clustering of images. The problem with the earlier methods used in clustering images was in selecting in...

Publication date: 2015
